{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# L2-B - Linear Quantization I: Get the Scale and Zero Point\n",
    "\n",
    "In this lesson, continue to learn about fundamentals of linear quantization, and implement your own Linear Quantizer."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Run the next cell to import all of the functions you have used before in the previous lesson(s) of `Linear Quantization I` to follow along with the video.\n",
    "\n",
    "- To access the `helper.py` file, you can click `File --> Open...`, on the top left."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 199
   },
   "outputs": [],
   "source": [
    "import torch\n",
    "\n",
    "from helper import linear_q_with_scale_and_zero_point, linear_dequantization, plot_quantization_errors\n",
    "\n",
    "### a dummy tensor to test the implementation\n",
    "test_tensor=torch.tensor(\n",
    "    [[191.6, -13.5, 728.6],\n",
    "     [92.14, 295.5,  -184],\n",
    "     [0,     684.6, 245.5]]\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "dpKLMdCYvT_W"
   },
   "source": [
    "## Finding `Scale` and `Zero Point` for Quantization"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 46
   },
   "outputs": [],
   "source": [
    "q_min = torch.iinfo(torch.int8).min\n",
    "q_max = torch.iinfo(torch.int8).max"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": [
    "q_min"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": [
    "q_max"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 46
   },
   "outputs": [],
   "source": [
    "# r_min = test_tensor.min()\n",
    "r_min = test_tensor.min().item()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": [
    "r_min"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": [
    "r_max = test_tensor.max().item()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": [
    "r_max"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": [
    "scale = (r_max - r_min) / (q_max - q_min)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": [
    "scale"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": [
    "zero_point = q_min - (r_min / scale)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": [
    "zero_point"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": [
    "zero_point = int(round(zero_point))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": [
    "zero_point"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- Now, put all of this in a function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 352,
    "id": "l1IRM2eazm7o"
   },
   "outputs": [],
   "source": [
    "def get_q_scale_and_zero_point(tensor, dtype=torch.int8):\n",
    "    \n",
    "    q_min, q_max = torch.iinfo(dtype).min, torch.iinfo(dtype).max\n",
    "    r_min, r_max = tensor.min().item(), tensor.max().item()\n",
    "\n",
    "    scale = (r_max - r_min) / (q_max - q_min)\n",
    "\n",
    "    zero_point = q_min - (r_min / scale)\n",
    "\n",
    "    # clip the zero_point to fall in [quantized_min, quantized_max]\n",
    "    if zero_point < q_min:\n",
    "        zero_point = q_min\n",
    "    elif zero_point > q_max:\n",
    "        zero_point = q_max\n",
    "    else:\n",
    "        # round and cast to int\n",
    "        zero_point = int(round(zero_point))\n",
    "    \n",
    "    return scale, zero_point"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- Test the implementation using the `test_tensor` defined earlier.\n",
    "```Python\n",
    "[[191.6, -13.5, 728.6],\n",
    " [92.14, 295.5,  -184],\n",
    " [0,     684.6, 245.5]]\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 46,
    "id": "Us4-c4nnR2Ln"
   },
   "outputs": [],
   "source": [
    "new_scale, new_zero_point = get_q_scale_and_zero_point(\n",
    "    test_tensor)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "height": 29,
    "id": "9XA1_C8OUTVv",
    "outputId": "63eedcd1-c56b-444f-8812-baa19c5dc642"
   },
   "outputs": [],
   "source": [
    "new_scale"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "height": 29,
    "id": "vC4QxI2jUUSe",
    "outputId": "5e8cde8e-6536-4db5-d1a8-b6660db4a755"
   },
   "outputs": [],
   "source": [
    "new_zero_point"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Quantization and Dequantization with Calculated `Scale` and `Zero Point`\n",
    "\n",
    "- Use the calculated `scale` and `zero_point` with the functions `linear_q_with_scale_and_zero_point` and `linear_dequantization`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 425
    },
    "height": 46,
    "id": "C9MtUUysR6oH",
    "outputId": "4d6d769c-8db8-4f1c-c9d8-a3393e937c73"
   },
   "outputs": [],
   "source": [
    "quantized_tensor = linear_q_with_scale_and_zero_point(\n",
    "    test_tensor, new_scale, new_zero_point)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 63
   },
   "outputs": [],
   "source": [
    "dequantized_tensor = linear_dequantization(quantized_tensor,\n",
    "                                           new_scale, new_zero_point)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- Plot to see how the Quantization Error looks like after using calculated `scale` and `zero_point`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 46
   },
   "outputs": [],
   "source": [
    "plot_quantization_errors(test_tensor, quantized_tensor, \n",
    "                         dequantized_tensor)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": [
    "(dequantized_tensor-test_tensor).square().mean()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "vYJxLMakItHX"
   },
   "source": [
    "### Put Everything Together: Your Own Linear Quantizer\n",
    "\n",
    "- Now, put everything togther to make your own Linear Quantizer."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 199,
    "id": "wWJWsvcYTGQc"
   },
   "outputs": [],
   "source": [
    "def linear_quantization(tensor, dtype=torch.int8):\n",
    "    scale, zero_point = get_q_scale_and_zero_point(tensor, \n",
    "                                                   dtype=dtype)\n",
    "    \n",
    "    quantized_tensor = linear_q_with_scale_and_zero_point(tensor,\n",
    "                                                          scale, \n",
    "                                                          zero_point, \n",
    "                                                          dtype=dtype)\n",
    "    \n",
    "    return quantized_tensor, scale , zero_point"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "qC0X9ux6JEmi"
   },
   "source": [
    "- Test your implementation on a random matrix."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 425
    },
    "height": 29,
    "id": "QxWC2JPbIkS7",
    "outputId": "b583d611-c00c-4ac7-b2ae-10e83f2d295a"
   },
   "outputs": [],
   "source": [
    "r_tensor = torch.randn((4, 4))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Note:** Since the values are random, what you see in the video might be different than what you will get."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": [
    "r_tensor"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 46
   },
   "outputs": [],
   "source": [
    "quantized_tensor, scale, zero_point = linear_quantization(r_tensor)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": [
    "quantized_tensor"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": [
    "scale"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": [
    "zero_point"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 46
   },
   "outputs": [],
   "source": [
    "dequantized_tensor = linear_dequantization(quantized_tensor,\n",
    "                                           scale, zero_point)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 46
   },
   "outputs": [],
   "source": [
    "plot_quantization_errors(r_tensor, quantized_tensor,\n",
    "                         dequantized_tensor)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": [
    "(dequantized_tensor-r_tensor).square().mean()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "colab": {
   "collapsed_sections": [
    "kd9Q2SD0MJGw",
    "E1palk4uRQX9",
    "oAexFpXiX1PW",
    "ChnEqFPYMn3p",
    "MFx2m7RmzRd5",
    "LbPjb9OOi0Xp",
    "6WopWDYWQr7X",
    "LYfTqh_VMTzT",
    "4YZP-9XTNkur",
    "dpKLMdCYvT_W",
    "2hoC5tcJznoI",
    "S68UGldKRnJc",
    "Y4Qfalu9vtHv",
    "WJN9IfVLTFNd",
    "qC0X9ux6JEmi",
    "AkwpMs-C5ccj",
    "EDpd5Te632KY",
    "JA1-rcLz4t4D",
    "kROAEGfdDsau",
    "oo4BCLpsDw3t",
    "h2gK-eALFc8U",
    "yDin7Rm6Dzqu",
    "X6J9ZiyHWzHa",
    "qUw1gQUu5yIe",
    "vGll7vBT6BGI",
    "Cl3AUuDuAH5w",
    "8NS1TnQt6E6v",
    "vcRy85lACotg"
   ],
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
